BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to a method and system for synthesizing a 
circuit representation of a circuit into a new circuit representation having greater 
unateness. 

2. Background Art 

The number of transistors that can be fabricated in a single IC has 
been growing exponentially over the last three decades. A well-known example is 
the Intel series of microprocessors. Intel's first commercial microprocessor, the 
4004, was built with 2,300 transistors in 1971, whereas a recent Intel
microprocessor, the Pentium III, introduced in 1999, contains 9.5 million transistors.
The clock frequency of the microprocessors has also increased dramatically, from the
4004's 0.1 MHz to the Pentium III's 550 MHz. The 1998 International Technology
Roadmap for Semiconductors developed by the Semiconductor Industry Association
(SIA) predicts that the transistor count and the clock frequency of ICs will grow
even faster in the next decade.

It is thus becoming extremely difficult and time-consuming to design 
all the components in a complex IC from scratch, verify their functional and timing 
correctness, and ensure that overall performance requirements are met. To solve 
this challenging problem, a "design reuse" methodology is being widely introduced, 
which integrates large standardized circuit blocks into a single IC called a system 
on a chip (SOC). An SOC integrates a set of predesigned "off-the-shelf" blocks to 
build the entire system on a single chip, just as off-the-shelf ICs have been used to 
build a system on a board. The SOC methodology allows IC designers to focus on 
the interfaces linking the predesigned blocks. Thus it saves a tremendous amount 
of time that the designers would have spent creating all the blocks from scratch, and 
verifying their correctness. For this reason, the SOC design approach is becoming 
increasingly popular. 

A large portion of the reused blocks in an SOC are intellectual 
property (IP) circuits, which are also called cores or virtual components, and are 
often provided by third party vendors. The IP providers typically transfer their
designs to SOC designers in a way that hides the key design details of the IP circuits
and so protects the IP provider's investment in the IP designs. The IP circuits that
have been developed so far cover numerous functions and support many different 
IC technologies. Additionally, the number and scope of the available IP circuits are 
rapidly growing. 

IP circuits are currently available in three different forms known as 
hard, soft, and firm. Hard IP circuits are provided in the form of complete layouts 
that are optimized and verified by the IP providers for a particular IC technology. 
Therefore, hard IP circuits can save time in all SOC design steps, but cannot be
re-optimized by system designers for other technologies. The intellectual property of
hard IP circuits includes all the implementation details and is protected by providing
the system designers only with the circuit's high-level behavioral or functional
specifications. Soft IP circuits are provided in the form of register-transfer level
(RTL) descriptions, which define the circuit's behavior using a set of high-level 
blocks. These blocks can be converted by system designers to lower-level designs 
at the gate and physical levels. Thus soft IP circuits can be optimized for a variety 
of IC technologies and performance requirements, while they can save SOC design 
time in the high-level design and verification steps. Their RTL designs are 
considered to be the intellectual property contents of soft IP circuits. Finally, firm 
IP circuits are provided as netlists or gate-level descriptions. They allow the system 
designers to optimize the IP circuit's physical design such as cell placement and 
routing for various IC technologies. They provide the major advantages of both
hard and soft IP circuits: they save system design time while allowing the flexibility
of retargeting the IP circuits at various IC technologies. Both their RTL and
gate-level designs are considered to be the intellectual property contents of firm IP
circuits. While hard IP circuits are primarily aimed at ASIC designs, some soft and
firm IP circuits are aimed at SOCs implemented using field programmable gate 
array (FPGA) technology. 

Although the advancement of IC technology allows extremely large 
designs to be integrated in a single chip, today's IC technology presents major 
challenges to the existing design and testing methodologies. For example, testing
requirements for complex digital circuits are becoming increasingly tight. Using
traditional processes to synthesize the implementation of digital and other circuits
often leads to circuits that are either inefficient in meeting testing requirements (e.g.,
they need unreasonably large test sets) or cannot satisfy design constraints (e.g., delay
or area limits).

The unateness of a circuit has a substantial impact on circuit
testability and performance. Unate variables of a circuit representation z are
variables that appear in only complemented or only uncomplemented form in z's minimal
two-level expressions such as sum of products (SOP) or product of sums (POS)
expressions; binate variables are non-unate. For example, z_1 = ab + ac + bcd is
a minimal SOP expression for the four-variable function z_1 in which a and d are
unate, and b and c are binate.
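
The definition above can be checked mechanically on a small function. The following is a minimal sketch, not part of the described method: it assumes the function is given as a Python callable over 0/1 inputs and classifies each variable by exhaustive truth-table comparison, which agrees with the minimal-SOP definition for small functions. The function `z` and the helper name `classify_variables` are hypothetical illustrations.

```python
from itertools import product

def classify_variables(f, names):
    """Classify each input of f as positive unate, negative unate, binate, or independent.

    A variable is binate when raising it from 0 to 1 can both raise and lower f.
    """
    n = len(names)
    kinds = {}
    for i, name in enumerate(names):
        can_rise = can_fall = False
        for bits in product((0, 1), repeat=n):
            if bits[i] == 1:
                continue  # only compare each pair (x_i = 0, x_i = 1) once
            low = int(bool(f(*bits)))
            high = int(bool(f(*(bits[:i] + (1,) + bits[i + 1:]))))
            can_rise |= high > low
            can_fall |= high < low
        if can_rise and can_fall:
            kinds[name] = "binate"
        elif can_rise:
            kinds[name] = "positive unate"
        elif can_fall:
            kinds[name] = "negative unate"
        else:
            kinds[name] = "independent"
    return kinds

# Hypothetical example: z = ab + (not a)c, where a appears in both polarities.
def z(a, b, c):
    return (a and b) or ((not a) and c)

kinds = classify_variables(z, ["a", "b", "c"])
```

The exhaustive loop is exponential in the number of inputs, so this is only suitable for illustrating the definition on small examples.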



Most circuit representations are binate by nature, and
it is often difficult to synthesize these binate functions into a circuit implementation
that can be efficiently tested for manufacturing defects and operate at very high
speeds.

For example, a high-speed circuit implementation known as "domino 
logic" requires that the circuit to be implemented be unate. Therefore, binate circuit 
functionality to be implemented in domino logic must be decomposed into unate 
circuit implementations. Similarly, static CMOS logic implementations become 
efficient if unate circuit implementations are extracted from an original binate circuit 
prior to implementation. Datapath logic circuits such as adders, subtracters, 
comparators, and ALUs are good examples of applications where carry generation 
functions (i.e., unate functions) are extracted from a larger (often binate) function 
and implemented in high-speed circuit structures. 

Another advantage of unate circuit implementations is their relatively 
small universal test set (i.e., the set of minimal true and maximal false test vectors 
for a function z). These test vectors have the useful property that they can detect 
all multiple stuck-at faults in any implementation of z. Universal test sets guarantee 
a very high coverage of manufacturing defects in a vast range of implementations 
for given circuit functionality. The universal test sets for binate circuit 
implementations tend to become excessively large. In addition, the unateness 
property enables the generation of test vectors from the behavioral functions of 
circuits before their implementations are actually executed. 
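
As an illustration of the universal test set described above, the following minimal sketch enumerates the minimal true and maximal false vectors of a small function. It assumes, for simplicity, that the function is positive unate in every variable so that vectors can be compared bitwise; the example function `z` and the helper name are hypothetical.

```python
from itertools import product

def universal_test_set(f, n):
    """Return (minimal true vectors, maximal false vectors) of an n-input function f.

    Assumes f is positive unate in all inputs, so u <= v is the bitwise order.
    """
    vectors = list(product((0, 1), repeat=n))

    def strictly_below(u, v):
        return u != v and all(a <= b for a, b in zip(u, v))

    true_vecs = [v for v in vectors if f(*v)]
    false_vecs = [v for v in vectors if not f(*v)]
    # A true vector is minimal if no other true vector lies below it.
    min_true = [v for v in true_vecs
                if not any(strictly_below(u, v) for u in true_vecs)]
    # A false vector is maximal if no other false vector lies above it.
    max_false = [v for v in false_vecs
                 if not any(strictly_below(v, u) for u in false_vecs)]
    return min_true, max_false

# Hypothetical positive-unate example: z = ab + c.
def z(a, b, c):
    return (a and b) or c

min_true, max_false = universal_test_set(z, 3)
```

For this three-input example the universal test set has four vectors, half of the eight possible input combinations.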

Existing functional decomposition processes are not suited to the goal
of decomposing a binate circuit representation into a small set of unate subfunctions.
One existing functional decomposition process called kernel extraction is aimed at
multi-level logic synthesis. The kernels of a circuit representation f are defined as
f's cube-free primary divisors or quotients. For example, the kernels of f = (a +
b + c)(d + e)f + bfg + h include d + e, d + e + g, and a + b + c. This
decomposition process employs algebraic division operations with the kernels
serving as divisors or quotients of f. Here algebraic division represents f by a logic
expression of the form f = pq + r. The kernels of f can be binate, so f's
subfunctions p, q, and r can also be binate. In addition, kernel extraction often leads
to an excessive number of subfunctions, and so is not practical for unate
decomposition.

Another existing decomposition process is Boole-Shannon expansion,
which represents circuit implementation f by x·f_x + x̄·f_x̄, where f_x is the cofactor of
f with respect to x. Cofactor f_x is defined as the subfunction of f obtained by
assigning 1 to variable x in f; f_x̄ is obtained likewise by assigning 0 to x.
Boole-Shannon expansion has been widely used for
binary decision diagram (BDD) construction, technology mapping and field
programmable gate array synthesis. Boole-Shannon expansion is unsuited to the
goal of obtaining a small set of unate circuit implementations, however, since it may
only make the child functions f_x and f_x̄ unate, while always expressing the parent
function in a binate form. When applied repeatedly, Boole-Shannon expansion can
also produce an unnecessarily large number of subfunctions, as each is created by
eliminating only one binate variable at a time.
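
The expansion can be illustrated with a short sketch. It assumes functions are Python callables over 0/1 inputs and checks the identity f = x·f_x + x̄·f_x̄ by exhaustive enumeration; the helper names `cofactor` and `shannon_holds` are hypothetical.

```python
from itertools import product

def cofactor(f, index, value):
    """Return the cofactor of f obtained by fixing input `index` to `value`."""
    def g(*bits):
        return f(*(bits[:index] + (value,) + bits[index + 1:]))
    return g

def shannon_holds(f, n, index):
    """Check f == x*f_x + (not x)*f_xbar for every input vector."""
    f_x = cofactor(f, index, 1)
    f_xbar = cofactor(f, index, 0)
    for bits in product((0, 1), repeat=n):
        x = bits[index]
        expanded = (x and f_x(*bits)) or ((not x) and f_xbar(*bits))
        if bool(expanded) != bool(f(*bits)):
            return False
    return True

# Example: three-input parity. Expanding on a removes only one binate variable;
# both cofactors still depend on b and c in a binate way.
def parity(a, b, c):
    return a ^ b ^ c

ok = shannon_holds(parity, 3, 0)
f_a = cofactor(parity, 0, 1)  # f_a(b, c) = 1 xor b xor c
```

Here the parent expression x·f_x + x̄·f_x̄ contains x in both polarities, which is the binate parent form noted above.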

Finally, a disjoint or disjunctive decomposition represents a Boolean
function f(X, Y) in the form h(g_1(X_1), g_2(X_2), ..., g_k(X_k), Y), where X_1, X_2, ..., X_k
and Y are disjoint (i.e., non-overlapping) variable sets. This decomposition is
relatively easy to compute, and is sometimes used for logic synthesis problems
where the interconnection cost dominates the circuit's overall cost. It has the
drawback that many circuit representations cannot be disjointly decomposed, and,
like Boole-Shannon expansion, it can make the parent function h binate. Thus, the
disjoint decomposition technique is also not appropriate for our unate decomposition
goal.

SUMMARY OF THE INVENTION 

One object of the present invention is to provide a method and system
for efficiently synthesizing a circuit representation into a new circuit representation
(i.e., circuit implementation) having greater unateness. Notably, those of ordinary
skill in the relevant art will appreciate that the present invention may be
implemented or applied in a variety of circumstances to synthesize circuits beyond
those discussed, by way of example, in the preceding Background Art.

One advantage of circuit implementations that possess unateness is 
their relatively small universal test set (i.e., the set of its minimal true and maximal 
false test vectors). Universal test sets guarantee a very high coverage of 
manufacturing defects in a vast range of implementations for a given function.
Unlike universal test sets for highly unate circuit implementations, the universal test 
sets for largely binate circuits tend to become excessively large. In addition, the 
unateness property enables the generation of test vectors from the behavioral 
functions of circuits before their implementations are actually executed. 

Another advantage of unate circuit implementations is their low chip 
area. Yet another advantage of unate circuits is their ability to operate at very high 
speeds. For example, a low-area high-speed circuit implementation known as
"domino logic" requires that the Boolean function to be implemented be unate.
Therefore, binate functions to be implemented in domino logic must be decomposed 
into unate functions prior to implementation. Similarly, static CMOS logic 
implementations become efficient if unate functions are extracted from an original 
binate function prior to implementation. Datapath logic circuits such as adders, 
subtractors, comparators, and ALUs are good examples of applications where carry 
generation functions (i.e., unate functions) are extracted from a larger (often binate) 
function and implemented in high-speed circuit structures. 

To meet these and other objects and advantages of the present 
invention, a method having preferred and alternate embodiments is provided for 
synthesizing a representation of a circuit into a new representation having greater 
unateness. The method includes (i) partitioning a circuit representation to obtain 
a representation of at least one sub-circuit, (ii) recursively decomposing the 
representation of the at least one sub-circuit into a sum-of-products or product-of- 
sums representation having greater unateness than the representation of the at least 
one sub-circuit, and (iii) merging the sum-of-products or product-of-sums 
representation into the circuit representation to form a new circuit representation. 



The invented method may additionally include repeating steps (i), (ii) 
and (iii) until a desired level of unateness for the new circuit representation has been 
achieved. 

The invented method may additionally include, for each 
decomposition, selecting the sum-of-products or product-of-sums representation 
having fewer binate variables. 

The invented method may additionally include merging common 
expressions of the sum-of-products or product-of-sums representations. 

The invented method may additionally include implementing 
algebraic division to merge common unate expressions. 

The invented method may additionally include partitioning the circuit 
representation to obtain a representation of at least one sub-circuit that is highly 
unate. 

The invented method may additionally include implementing a binary 
decision diagram to recursively decompose the representation of the at least one 
sub-circuit into the sum-of-products or product-of-sums representation. The binary 
decision diagram may be a zero-suppressed binary decision diagram. 

Additionally, a system having preferred and alternate embodiments 
is provided for synthesizing a circuit representation into a new circuit representation 
having greater unateness. The system comprises a computing device configured to 
(i) receive input defining a circuit representation, (ii) partition the circuit 
representation to obtain a representation of at least one sub-circuit, (iii) recursively 
decompose the representation of the at least one sub-circuit into a sum-of-products 
or product-of-sums representation having greater unateness than the representation 
of the at least one sub-circuit, (iv) merge the sum-of-products or product-of-sums 
representation into the circuit representation to form the new circuit representation, 
and (v) output the new circuit representation. 



The computing device may be further configured to receive input 
defining a desired level of unateness for the new circuit representation, and repeat 
steps (ii), (iii) and (iv) until the desired level of unateness is achieved. 

The computing device may be further configured to, for each
decomposition, select the sum-of-products or product-of-sums representation having
fewer binate variables.

The computing device may be further configured to merge common 
expressions of the sum-of-products or product-of-sums representations. 

The computing device may be further configured to implement 
algebraic division to merge common expressions. 

The computing device may be additionally configured to partition the 
circuit representation to define a representation of at least one sub-circuit that is 
highly unate. 

The computing device may be additionally configured to implement 
a binary decision diagram to recursively decompose the representation of the at least 
one sub-circuit into a sum-of-products or product-of-sums representation. The 
binary decision diagram may be a zero-suppressed binary decision diagram. 

The circuit representation and the new circuit representation may be 
input to the computing device and output from the computing device, respectively, 
in a hardware description language such as Verilog or VHDL. 

Additionally, a preferred computer-readable storage embodiment of 
the present invention is provided. In accord with this embodiment, a
computer-readable storage medium contains computer executable code for instructing one or
more computers to (i) receive input defining a circuit representation, (ii) partition
the circuit representation to obtain a representation of at least one sub-circuit, (iii)
recursively decompose the representation of the at least one sub-circuit into a
sum-of-products or product-of-sums representation having greater unateness than the
representation of the at least one sub-circuit, (iv) merge the sum-of-products or
product-of-sums representation into the circuit representation to form a new circuit
representation, and (v) output the new circuit representation.

The computer executable code may additionally instruct the 
computer(s) to receive input defining a desired level of unateness for the new circuit 
representation, and repeat steps (ii), (iii) and (iv) until the desired level of unateness 
is achieved. 

The computer executable code may additionally instruct the 
computer(s) to, for each decomposition, select the sum-of-products or product-of- 
sums representation having fewer binate variables. 

The computer executable code may additionally instruct the 
computer(s) to merge common expressions of the sum-of-products or product-of- 
sums representations. 

The computer executable code may additionally instruct the 
computer(s) to implement algebraic division to merge common expressions.

The computer executable code may additionally instruct the 
computer(s) to employ a binary decision diagram to recursively decompose the 
representation of the at least one sub-circuit into the sum-of-products or product-of- 
sums representation. 

The above objects and advantages of the present invention are readily 
apparent from the following detailed description of the preferred and alternate 
embodiments, when taken in connection with the accompanying drawings. 
FIGURES 1a-1c are functional circuit diagrams illustrating (a) a
single binate block with three functions, (b) two unate blocks with six functions, and
(c) three unate blocks with fifteen functions;

FIGURE 2 is a block representation corresponding to a
decomposition of f(X) = h(g_1(X), g_2(X), ..., g_k(X)) in accordance with the present
invention;

FIGURES 3a-3b are (a) a binary tree representing an AND-OR 
decomposition in accordance with the present invention and (b) a corresponding 
block representation in accordance with the present invention; 

FIGURE 4 is a block representation corresponding to a k-level
AND/OR tree in accordance with the present invention;

FIGURE 5 is an example gate-level circuit to be decomposed by the 
computer-implemented process UDSYN in accordance with the present invention; 

FIGURES 6a-6i are (a) a circuit graph G_C1 for the circuit of Figure
5, (b) partition G_P1 selected from G_C1, (c) unate decomposition performed on G_P1,
(d) graph G_C2 obtained by merging B_2 and G_C1, (e) partition G_P2 selected from G_C2,
(f) final unate decomposition of G_P2, (g) circuit graph G_C3 obtained by merging B_3
and G_C2, (h) partition G_P3 selected from G_C3, and (i) final block representation for
the circuit of Figure 5 in accordance with the present invention;

FIGURES 7a-7c illustrate a zero-suppressed binary decision diagram
(BDD) representing f_4^SOP = ab + cb: (a) a path representing cube ab; (b)
a path representing cube cb; and (c) a zero-suppressed BDD representing
f_4^POS = (a + c)(a + b)(b + c) in accordance with the present invention; and
FIGURES 8a-8d illustrate four steps in partitioning G_C1 of Fig. 6a:
(a) start with a primary input node n10 and select n7; (b) select n14 and its
transitive fanin nodes; (c) select n18 and its transitive fanin nodes; (d) final partition
G_P1 in accordance with the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

The present invention comprises a method and system having 
preferred and alternate embodiments for efficiently synthesizing a representation of 
a circuit into a new representation having greater unateness. 

For purposes of illustration, the present invention is described in the
context of Boolean function-based digital circuitry. However, application of the
present invention is not so limited. Notably, the present invention may be applied
to a variety of circuit representations such as gate-level circuits, PLA
representations, transistor-level representations, and HDL-based circuits.

Fig. 1a defines a circuit representation 10 with three binate outputs
(e.g., x, y and z). Figure 1b shows a two-block representation for circuit 10, and
Fig. 1c a three-block representation. The block representations of (b) and (c) are
obtained by decomposing the functions defining the original single-block circuit
representation 10. Circuit functionality for each block is represented in a
sum-of-products (SOP) or product-of-sums (POS) form. Table 1 compares the test
requirements for the three representations of Fig. 1.



TABLE 1

                 Test Requirements                 Implementation Flexibility
Block            No. of Binate     Universal       No. of Block   Area of an Example
Representation   Variables for     Test Set Size   Functions      Synthesized Circuit
                 all Functions

Figure 1a        7                 64              3              28
Figure 1b        0                 33              6              29
Figure 1c        0                 47              12             50

In the case of Fig. 1a, function z is binate in all six variables, so its
universal test set consists of all 64 possible test vectors. The decomposed designs
of Figs. 1b and 1c require smaller universal test sets (33 and 47, respectively), since
all their internal blocks are unate. Although the difference in test set size is minor
in this small example, it tends to be significant for larger circuits.

Circuits that do not have natural block representations are often 
implemented by logic synthesis systems, while those with natural block 
representations are often implemented manually. Although the present invention is 
not limited to the former case, we assume that the target circuit is of the former type 
in order to best describe the present invention. 

For a given circuit, the block representations created in accordance 
with the present invention can be considered as design constraints. In other words, 
the boundaries of the blocks serve as a high-level structural design constraint that 
must be satisfied by low-level implementations of the circuit such as gate-level or 
transistor-level implementations. Notably, these design constraints tend to restrict 
implementation flexibility. 

The outputs of any blocks in a circuit C's block representation S_B
define the block functions. Circuit C's implementation flexibility can be roughly
measured by the number of block functions that C employs. This follows from the
fact that a large number of block functions in S_B implies that the functions
themselves are small. Block functions in different blocks cannot be merged, so
implementations of small block functions are generally less flexible than large ones.
Thus the fewer the block functions in S_B, the higher the implementation flexibility
of C.

For example, Table 1 compares the implementation flexibility of the 
block representations in Fig. 1. Figure 1b has three more block functions than Fig.
1a, while Fig. 1c has 12 more. Whereas Fig. 1a has full implementation flexibility,
the other block representations have limited flexibility. The last column of Table 1 
compares the area of some example implementations synthesized by a commercial 
synthesis system Synopsys® Design Compiler with the goal of reducing area. The 
area is calculated from the relative gate areas defined in the Synopsys cell library; 
for example, inverter = 1, AND2 = 2, AND4 = 3, OR2 = 2, and OR4 = 3. In 
summary, this example suggests that lower implementation flexibility often leads 
to poor implementations in terms of circuit area. 
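
The area figure described above is a weighted gate count. The following minimal sketch shows that arithmetic using the relative gate areas quoted in the text; the gate counts in `example_netlist` are hypothetical and do not correspond to any circuit in Table 1.

```python
# Relative gate areas as quoted in the text above.
GATE_AREA = {"inverter": 1, "AND2": 2, "AND4": 3, "OR2": 2, "OR4": 3}

def circuit_area(gate_counts):
    """Sum the relative area of each gate type times its number of instances."""
    return sum(GATE_AREA[gate] * count for gate, count in gate_counts.items())

# Hypothetical netlist: 2 inverters, 3 two-input ANDs, 1 two-input OR.
example_netlist = {"inverter": 2, "AND2": 3, "OR2": 1}
area = circuit_area(example_netlist)  # 2*1 + 3*2 + 1*2
```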

To permit a broad range of implementation styles, the invented
synthesis process attempts to decompose a binate function into as small a set of
unate subfunctions as possible. In general, a decomposition of function f can be
expressed as:

f(X) = h(g_1(X), g_2(X), ..., g_k(X))

Let f be the root function, subfunction h the parent function, and each
subfunction g_i a child function. X = {x_1, x_2, ..., x_n} denotes the support set of f and
each g_i. A decomposition is called a unate decomposition if all the subfunctions h
and g_{1:k} in f(X) are unate. A decomposition of the form f(X) transforms a
single-block model of function f into a two-block model with h defining one block and g_{1:k}
the other; see Figures 2 and 3.
In accordance with the present invention, a preferred method for
synthesizing circuits utilizing unateness properties of Boolean functions involves
recursive OR and AND decompositions of a circuit output function f and its
subfunctions. The OR and AND decompositions represent f(X) by h(g_1(X), g_2(X)),
where h has the form g_1 + g_2 and g_1·g_2, respectively. We obtain an OR
decomposition of f from an SOP form of f, and an AND decomposition from a POS
form.

For example, consider a binate function f_1, whose SOP form is
ab + ac + ca. A possible OR decomposition of f_1 is h = g_1 + g_2 with g_1 = ac
and g_2 = ab + ca. This decomposition makes all the subfunctions h, g_1, and g_2
unate, and so is a unate decomposition. Now consider f_1's POS form
(a + c)(a + b + c). We can obtain directly from this form an AND
decomposition with subfunctions h = g_1·g_2, g_1 = a + c, and g_2 = a + b + c.
This is also a unate decomposition.

A single OR or AND decomposition of a large binate function may
not lead to a unate decomposition. However, a sequence of OR or AND
decompositions applied to f recursively always produces a unate decomposition for
any function f. The general form of such a sequence with k levels is

f = h^1(g_1^1, g_2^1)
...
g_2^{k-1} = h^k(g_1^k, g_2^k)

where h^j and g^j denote a parent function and a subfunction, respectively, produced
by the j-th level decomposition. Parent function h^j can be either AND or OR. A
k-level sequence of AND and OR decompositions forms a binary AND-OR tree,
where the internal nodes represent AND or OR operations, while the leaf nodes
denote unate subfunctions that need no further decomposition.
An arbitrary sequence of AND and OR decompositions can lead to
an excessive number of subfunctions. To reduce this number, we restrict our
attention to sequences of the following type, which we refer to as unate AND-OR
decompositions.

f = h^1(g_u^1, g_b^1)
g_b^1 = h^2(g_u^2, g_b^2)
...
g_b^{k-1} = h^k(g_u^k, g_b^k)

As in the general case, h^j is either AND or OR, but the final g_b^k and
every g_u^j are unate, while every g_b^j except the final one is either unate or binate.
This decomposition can also be represented in the compact factored form

f = h^1(g_u^1, h^2(g_u^2, h^3(g_u^3, ..., h^{k-1}(g_u^{k-1}, h^k(g_u^k, g_b^k))...)))     (1)

as well as the general form

f(X) = h(g_u^1, g_u^2, ..., g_u^k, g_b^k).     (2)

Comparing (1) with (2), we see that the parent function h in (2) is
composed of the AND and OR subfunctions h^1, h^2, ..., h^k only, and so is always
unate.
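
The observation that a parent function built only from AND and OR operations is unate can be checked on a small case. The sketch below is illustrative only: it assumes a hypothetical five-level choice of operations, builds the nested parent function of form (1) over its k + 1 inputs, and verifies by enumeration that raising any input never lowers the output. The names `make_parent` and `is_positive_unate` are not from the described method.

```python
from itertools import product

def make_parent(ops):
    """Build h(g_u^1, ..., g_u^k, g_b^k) from ops[j] in {'AND', 'OR'} for level j+1."""
    k = len(ops)

    def h(*g):
        assert len(g) == k + 1
        acc = g[k]  # innermost argument, g_b^k
        for j in range(k - 1, -1, -1):  # fold from the innermost level outwards
            acc = (g[j] and acc) if ops[j] == "AND" else (g[j] or acc)
        return int(bool(acc))

    return h

def is_positive_unate(f, n):
    """True if raising any single input from 0 to 1 never lowers f."""
    for bits in product((0, 1), repeat=n):
        for i in range(n):
            if bits[i] == 0:
                raised = bits[:i] + (1,) + bits[i + 1:]
                if f(*raised) < f(*bits):
                    return False
    return True

# Hypothetical operation sequence for a 5-level decomposition.
h = make_parent(["AND", "OR", "OR", "AND", "AND"])
parent_is_unate = is_positive_unate(h, 6)
```

Any other sequence of AND and OR operations passes the same check, since both operations are monotone in each argument.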

Figure 3a shows a binary AND-OR tree corresponding to the unate
decomposition

f = (g_u^1 · (g_u^2 + (g_u^3 + (g_u^4 ... (g_u^5 · g_b^5)))))

obtained by a unate AND-OR decomposition with 5 levels. The internal nodes 30a-
30e in Fig. 3a represent AND or OR operations, while the leaf nodes 32a-32f at the
bottom represent unate subfunctions. Figure 3b depicts the block representation
corresponding to Figure 3a. B_1 defines an AND-OR tree network that implements
the function h. B_2 is a network of undefined internal representation that implements
the unate subfunctions g_u^1, g_u^2, g_u^3, g_u^4, g_u^5, and g_b^5.

In general, we obtain the foregoing kind of unate AND-OR
decomposition for f as follows: first decompose f into g_u^1 and g_b^1 using an AND or
OR operation that makes g_u^1 unate; then repeatedly decompose g_b^j into g_u^{j+1} and g_b^{j+1}
in a similar way, until g_b^{j+1} becomes unate. This must eventually happen, because
a g_b^j consisting of a single product or sum term is unate. In practice, the AND-OR
decomposition process often terminates with a final g_b^j consisting of a relatively
large unate function.

As noted above, the global parent function h(g_u^1, ..., g_u^k, g_b^k) in (2)
is unate. Thus, the final result of a k-level AND-OR decomposition is a set of k +
2 unate subfunctions g_u^{1:k}, g_b^k, and h(g_u^1, ..., g_u^k, g_b^k).

Notably, an important goal of the block synthesis method shown in
Figure 3 is to find an AND-OR decomposition of a given function f using as few
subfunctions as possible. In addition, it is preferred that each g_u^j be selected in a
manner that makes the resulting g_b^j highly unate. This selection often leads to a
unate decomposition involving few subfunctions. Also such a g_u^j can be relatively
easily derived from a standard SOP or POS form.

Each level of a unate AND-OR decomposition is defined by either 
an AND or OR operation. How we select the operation at each level has a large 
impact on the final result, as we show with the following example. 

Consider f_2 = a ⊕ b ⊕ c, whose SOP and POS forms are given below.

f_2^SOP = ab̄c̄ + ābc̄ + āb̄c + abc     (3)
f_2^POS = (a + b + c)(a + b̄ + c̄)(ā + b + c̄)(ā + b̄ + c)     (4)
OR decompositions are derived from (3), and AND decompositions
are derived from (4). Suppose we select an OR operation in every level of the
decomposition. A possible result is:

f_2 = g_u^1 + (g_u^2 + (g_u^3 + g_b^3))

which involves five unate subfunctions:

g_u^1 = ab̄c̄;
g_u^2 = ābc̄;
g_u^3 = āb̄c;
g_b^3 = abc; and
h(g_u^1, g_u^2, g_u^3, g_b^3).

Note that in this particular example, the unate decomposition is
completed when the final g_b^k is a product term, and the resulting g subfunctions
correspond to each of the product terms in (3).
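
The all-OR case can be reproduced with a simple greedy procedure. The sketch below is an assumption-laden illustration, not the Unate-Decomp heuristic itself: it represents each product term as a mapping from variable name to polarity (1 for uncomplemented, 0 for complemented), repeatedly peels off a group of terms in which no variable appears in both polarities, and stops once the remainder is itself unate. The cube list assumes three-input parity written as its four minterms.

```python
def is_unate(cubes):
    """Syntactic check: no variable occurs in both polarities across the cubes."""
    seen = {}
    for cube in cubes:
        for var, polarity in cube.items():
            if seen.setdefault(var, polarity) != polarity:
                return False
    return True

def or_decompose(cubes):
    """Split an SOP cube list into unate groups g_u^1, g_u^2, ..., g_b^k (OR-only)."""
    groups = []
    rest = list(cubes)
    while rest and not is_unate(rest):
        g_u, g_b = [], []
        for cube in rest:
            # Greedily keep a cube in g_u if the group stays unate.
            if is_unate(g_u + [cube]):
                g_u.append(cube)
            else:
                g_b.append(cube)
        groups.append(g_u)
        rest = g_b
    groups.append(rest)  # final unate remainder g_b^k
    return groups

# Minterms of a xor b xor c (assumed SOP form).
parity_cubes = [
    {"a": 1, "b": 0, "c": 0},
    {"a": 0, "b": 1, "c": 0},
    {"a": 0, "b": 0, "c": 1},
    {"a": 1, "b": 1, "c": 1},
]
groups = or_decompose(parity_cubes)
```

For parity every pair of minterms conflicts in some variable, so each group holds a single product term. Together with the parent function this gives five subfunctions, as in the example above.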

Next, suppose we select an AND operation in every level. A
possible result is the unate decomposition f_2 = g_u^1·(g_u^2·(g_u^3·g_b^3)), which involves
five subfunctions:

g_u^1 = a + b + c;
g_u^2 = a + b̄ + c̄;
g_u^3 = ā + b + c̄;
g_b^3 = ā + b̄ + c; and
h(g_u^1, g_u^2, g_u^3, g_b^3).

Notably, a unate decomposition of f_2 can be obtained involving fewer
subfunctions if AND and OR operations are mixed as follows. Suppose we select
an OR operation in the first level and an AND operation in the second level. The
OR operation decomposes (3) into f_2 = g_u^1 + g_b^1, where g_u^1 = abc and
g_b^1 = ab̄c̄ + ābc̄ + āb̄c. To apply an AND operation to g_b^1, we use g_b^1's POS form
(ā + b̄)(ā + c̄)(b̄ + c̄)(a + b + c). Then an AND operation leads to g_b^1 = g_u^2·g_b^2, where
for example, g_u^2 = (a + b + c) and g_b^2 = (ā + b̄)(ā + c̄)(b̄ + c̄). Since g_b^2 is unate,
the unate decomposition is complete. Note that unlike the previous cases, the final
g_b^k here contains more than one term. The third unate decomposition of f_2 is

f_2 = g_u^1 + (g_u^2·g_b^2) = abc + (a + b + c)·(ā + b̄)(ā + c̄)(b̄ + c̄)

which involves only four subfunctions, one less than the first and second cases,
where we selected three OR and three AND operations, respectively. This example
shows that how we select the AND-OR operation in each level of the AND-OR
decomposition is very important.
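
The mixed decomposition can be verified by exhaustive evaluation. The sketch below assumes the subfunctions g_u^1 = abc, g_u^2 = a + b + c, and g_b^2 equal to the product of the three two-literal sums of complemented variables, and checks that g_u^1 + g_u^2·g_b^2 equals a ⊕ b ⊕ c on all eight input vectors. The Python function names are hypothetical.

```python
from itertools import product

def f2(a, b, c):
    """Reference function: three-input parity."""
    return a ^ b ^ c

def g_u1(a, b, c):
    return a & b & c  # all three inputs high

def g_u2(a, b, c):
    return a | b | c  # at least one input high

def g_b2(a, b, c):
    na, nb, nc = 1 - a, 1 - b, 1 - c
    # No two inputs high at once; negative unate in a, b, and c.
    return (na | nb) & (na | nc) & (nb | nc)

def decomposed(a, b, c):
    """OR at level 1, AND at level 2: g_u^1 + (g_u^2 . g_b^2)."""
    return g_u1(a, b, c) | (g_u2(a, b, c) & g_b2(a, b, c))

matches = all(f2(*v) == decomposed(*v) for v in product((0, 1), repeat=3))
```

Each of the three subfunctions uses every variable in a single polarity, and the parent is a plain OR of an AND, so all four subfunctions are unate.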

Often, there are many possible AND and OR decompositions in each
level. This implies the existence of a large number of possible unate AND-OR
decompositions. For example, if an SOP form of $g_b^j$ contains $m$ product terms,
we can partition these terms into two groups defining $g_u^{j+1}$ and $g_b^{j+1}$
in $2^m$ different ways. At each level, either an AND or OR operation can be
chosen, so the number of possible $k$-level unate AND-OR decompositions is
$2^{m+k}$. Thus, finding a unate AND-OR decomposition of a large function $f$
involving a minimal number of subfunctions is often impractical. We therefore
introduce Unate-Decomp, a heuristic process that systematically selects AND or OR
decompositions at each recursion level, and produces a final unate AND-OR
decomposition containing relatively few subfunctions.
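
The growth in the number of candidate splits can be made concrete with a small
sketch. The helper name two_group_splits and the term labels are hypothetical and
serve only to illustrate the $2^m$ count; they are not part of Unate-Decomp.

```python
from itertools import product


def two_group_splits(terms):
    """Yield every assignment of the terms to two groups (g_u side, g_b side)."""
    for mask in product((0, 1), repeat=len(terms)):
        group_u = [t for t, bit in zip(terms, mask) if bit == 0]
        group_b = [t for t, bit in zip(terms, mask) if bit == 1]
        yield group_u, group_b


# Three hypothetical product terms give 2**3 = 8 assignments.
splits = list(two_group_splits(["t1", "t2", "t3"]))
print(len(splits))
```

Each additional product term doubles the number of assignments, which is why an
exhaustive search over all splits quickly becomes impractical.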

Unate-Decomp represents a function $f$ and all its subfunctions in both
SOP and POS forms. To produce an OR (AND) decomposition of $f$, it selects a set
$S$ of product terms (sum terms) from the SOP (POS) form of $f$, so that $S$
constitutes a unate subfunction $g_u$. The rest of the product terms (sum terms)
of $f$ define subfunction $g_b$. Unate-Decomp then represents $g_b$ by both SOP
and POS forms, which it uses to produce an OR and AND decomposition at the next
recursion level. To represent SOP and POS forms efficiently, binary decision
diagrams can be employed.

To decompose $f$ into as few unate subfunctions as possible,
Unate-Decomp produces each $g_u^j$ in a way that reduces the number of binate
variables in $g_b^j$. In the case of multiple-function circuits, Unate-Decomp
first decomposes each output function $f_i$ using the method described above. It
then merges common subfunctions of different functions $f_i$ and $f_j$ to reduce
the total number of subfunctions.

Notably, representing large circuits directly by two-level expressions 
is often inefficient. To handle such cases efficiently, a preferred process first 
partitions a given circuit, and then performs decomposition on each partition. For 
example, one partition is created for each process of Unate-Decomp parent function 
h. The resulting decomposition is then merged into the rest of the circuit. Then the 
next partition is created from the merged circuit, and the next process of Unate- 
Decomp is conducted. This process is repeated until no more partitioning is 
necessary. 

Preferably, each decomposition step and circuit partition are selected 
in a way that produces a small number of highly unate subfunctions. Consequently, 
the resulting block representations tend to have a high level of implementation 
flexibility. 

To decompose $f$ into as few unate subfunctions as possible,
Unate-Decomp produces each $g_u^j$ in a way that reduces the number of binate
variables in $g_b^j$. For example, consider the following function $f_3$:

f_3 = abc + ad + ae + bcf + bdf + be + cdf + ce + abdf
      + abe + acdf + cdf + cd + de

Suppose we decompose $f_3$ into $g_u + g_b$. Table 2 shows some
possible ways of doing this and the number of binate variables in the resulting $g_b$.



TABLE 2

g_u | g_b | No. of binate variables in g_b
cd + acdf + bdf | ad + abe + bcf + cdf + abc + abdf + ae + de + cdf + be + ce | 6
abdf + ce | ad + abe + bcf + cdf + abc + ae + de + cdf + be + cd + acdf + bdf | 5
bcf + bdf + cdf + be + ce + abc + ae | cd + acdf + abe + cdf + ad + abdf + de | 3
abc + ad + ae + bcf + bdf + be + cdf + ce | abdf + abe + acdf + cdf + cd + de | 2



While the first OR decomposition produces $g_b$ with six binate
variables, the last OR decomposition produces $g_b$ with only two binate variables,
and so is selected.
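
The selection criterion can be sketched as follows. This is a minimal
illustration, assuming a hypothetical encoding in which a cube is a list of
literal strings and a complemented literal carries a "~" prefix; the function
names and the sample expression are not taken from the specification.

```python
def binate_variables(sop):
    """Return the variables that occur in both polarities in an SOP form."""
    positive, negative = set(), set()
    for cube in sop:
        for literal in cube:
            if literal.startswith("~"):
                negative.add(literal[1:])
            else:
                positive.add(literal)
    return positive & negative


def n_bv(sop):
    """Number of binate variables of an SOP form."""
    return len(binate_variables(sop))


def best_or_split(candidates):
    """Among (g_u, g_b) candidates, pick the one whose g_b is least binate."""
    return min(candidates, key=lambda pair: n_bv(pair[1]))


# Hypothetical function ab + ~a c + b ~c, split in two different ways.
candidates = [
    ([["a", "b"]], [["~a", "c"], ["b", "~c"]]),   # g_b is binate in c
    ([["a", "b"], ["b", "~c"]], [["~a", "c"]]),   # g_b is unate
]
chosen = best_or_split(candidates)
```

Here the second candidate is chosen because its $g_b$ contains no binate
variable.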
In each level of the decomposition process, Unate-Decomp produces
a pair of AND and OR decompositions using a special form of cofactor operation
called Subset. The Subset operation for a literal $l_i$ extracts a subset $S$ of
product (sum) terms of a given SOP (POS) form by eliminating terms that contain
$l_i$. For example, applying Subset to the SOP form

$\bar{a}bc + acd + abd$

for literal $\bar{a}$ yields

$S = acd + abd$

Unate-Decomp systematically computes $S$ for a set of binate literals
$\{l_i\}$ so that $S$ is unate and the set of other terms is highly unate. Then $S$
defines $g_u^j$, and the other product terms define $g_b^j$.
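
The Subset operation itself is simple to express on an explicit cube list. The
sketch below assumes a hypothetical encoding (cubes as frozensets of literal
strings, "~" marking a complemented literal) and an illustrative choice of
polarities; the specification performs the operation on ZSBDDs rather than on
lists.

```python
def subset(sop, literal):
    """Return the cubes of `sop` that do not contain `literal`."""
    return [cube for cube in sop if literal not in cube]


# Illustrative SOP form: ~a b c + a c d + a b d
sop = [
    frozenset({"~a", "b", "c"}),
    frozenset({"a", "c", "d"}),
    frozenset({"a", "b", "d"}),
]
kept = subset(sop, "~a")                       # cubes that define S
removed = [cube for cube in sop if cube not in kept]
```

After the call, the literal `~a` no longer appears in `kept`, so the resulting
cube set is unate with respect to that variable.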

After a unate decomposition is formed, Unate-Decomp constructs two
blocks from an AND-OR tree representing the decomposition; see Fig. 4. To
ensure that all the block functions are unate, we place in block $B_1$ all the nodes
representing the subfunctions $g_u^{1:k}$ and $g_b^k$, which correspond to the leaf
nodes in the AND-OR decomposition tree. We place in block $B_2$ all the other
nodes, which represent AND and OR operations and together form the function $h$.

In the preceding description, we focused on decomposing a single
function. In the case of multiple-function circuits, Unate-Decomp first decomposes
each output function $f_i$ using the method described above. It then merges common
subfunctions of different functions $f_i$ and $f_j$ to reduce the total number of
subfunctions. Algebraic division operations are often employed by logic synthesis
techniques to efficiently combine common expressions. These operations can be
easily applied to the results of our AND-OR decompositions, and often reduce the
number of subfunctions significantly. Notably, Unate-Decomp incorporates
algebraic division in such a manner that two different functions share the divisor of
each division.
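
As an illustration of algebraic division on cube sets, the following sketch
implements the textbook weak-division procedure. The specification does not
state which division algorithm Unate-Decomp uses, so this choice, the function
name weak_divide, and the sample expression are assumptions made for the
example.

```python
def weak_divide(dividend, divisor):
    """Algebraic (weak) division of one cube set by another.

    Returns (quotient, remainder) such that
    dividend = quotient * divisor + remainder, with cubes as frozensets.
    """
    f = {frozenset(cube) for cube in dividend}
    d = [frozenset(cube) for cube in divisor]
    quotient = None
    for d_cube in d:
        # Co-factors of the dividend cubes that contain this divisor cube.
        partial = {cube - d_cube for cube in f if d_cube <= cube}
        quotient = partial if quotient is None else quotient & partial
    quotient = quotient or set()
    covered = {q | d_cube for q in quotient for d_cube in d}
    remainder = f - covered
    return quotient, remainder


# (ac + ad + bc + bd + e) / (a + b) gives quotient c + d and remainder e.
f = [{"a", "c"}, {"a", "d"}, {"b", "c"}, {"b", "d"}, {"e"}]
q, r = weak_divide(f, [{"a"}, {"b"}])
```

When two output functions yield a non-empty quotient for the same divisor, the
divisor can be implemented once and shared between them.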

Based on the unate decomposition concept described above, we 
introduce a computer-implemented synthesis program in accordance with the present 
invention called Unate Decomposition Synthesis (UDSYN). Representing large 
circuits directly by two-level expressions is often inefficient. To handle such cases 
efficiently, UDSYN first partitions the given circuit, and then performs 
decomposition on each partition, as generally described above. 

Table 3 contains one embodiment of a pseudocode representation of 
UDSYN in accordance with the present invention. Notably, it is understood by those 
of ordinary skill in the art that different computer programs and program 
arrangements can be implemented to support and execute the overall function of 
UDSYN. 

TABLE 3

Embodiment of process UDSYN(Verilog-input)

 1: G_C := Build-Circuit-Graph(Verilog-input);
 2: while (G_C ≠ ∅) begin
 3:   G_P := UD-Partition(G_C);            /* G_P is a graph representing a partitioned block */
 4:   G_C := G_C - G_P;                    /* Remove nodes in G_P from G_C */
 5:   for each output node n_R in G_P begin
 6:     Build-ZSBDD(n_R, G_P);             /* Create SOP and its complement for n_R */
 7:   end;
 8:   (B_i, B_{i+1}) := Unate-Decomp(G_P); /* B_i and B_{i+1} correspond to B_1 and B_2 of Fig. 4 */
 9:   G_C := G_C ∪ B_{i+1}; i := i + 1;    /* Insert nodes in B_{i+1} into G_C */
10: end;
11: Verilog-output := Interconnect-Blocks({B_i});  /* Verilog-output is the final block representation */
12: return Verilog-output;

UDSYN takes an input circuit in a form such as a Verilog
specification whose internal elements can be either functional descriptions or gates.
First, UDSYN builds a circuit graph $G_C$ whose nodes represent the internal elements.
It then creates a partition $G_P$ of $G_C$ using UD-Partition($G_C$), and removes
nodes in $G_P$ from $G_C$. The output functions of $G_P$ are represented in SOP and
POS forms. The process Unate-Decomp($G_P$) performs unate AND-OR decompositions on
the output nodes in $G_P$, and constructs decomposed blocks $B_1$ and $B_2$ as in
Fig. 4. Blocks $B_1$ and $B_2$ created from the $i$-th partition $G_{Pi}$ are denoted
by $B_i$ and $B_{i+1}$, respectively. Step 9 modifies $G_C$ by inserting all nodes of
$B_2$ into $G_C$. UDSYN repeats the above steps until all nodes in $G_C$ are removed.
It then constructs a hardware description language (e.g. Verilog, VHDL, etc.) output
file by specifying the interconnections among all blocks $B_i$.

We illustrate UDSYN using a gate-level circuit of Fig. 5 as input.
Figures 6a to i show intermediate results of steps 2 to 10 in Table 3. Figure 6a
shows the circuit graph $G_{C1}$ for the circuit of Fig. 5; each node in $G_{C1}$
corresponds to a gate in the circuit. UD-Partition($G_C$) creates a partition
$G_{P1}$ starting from the primary inputs of $G_{C1}$. The shading in $G_{C1}$
indicates nodes that are selected by UD-Partition($G_C$), and constitute $G_{P1}$.
Figure 6b represents $G_{P1}$ by a rectangle. All nodes in $G_{P1}$ are removed from
$G_{C1}$, and are merged into SOP and POS forms by steps 5 to 8. These SOP and POS
forms are decomposed by step 8 into unate subfunctions; these subfunctions are
grouped into two blocks $B_1$ and $B_2$ as in Fig. 4. As Fig. 6c shows, $B_1$
consists of seven subfunctions and $B_2$ consists of three subfunctions. We create a
new circuit graph $G_{C2}$ by merging $B_2$ and $G_C$, as shown in Fig. 6d.
Returning to step 3, UD-Partition($G_C$) selects some nodes (shaded) in $G_{C2}$ and
creates a new partition $G_{P2}$, which is represented by a rectangle in Fig. 6e.
Then step 8 decomposes $G_{P2}$ into blocks $B_2$ and $B_3$ appearing in Fig. 6f.
Step 9 merges $B_3$ into a new circuit graph $G_{C3}$ as in the preceding steps; see
Fig. 6g. Figure 6h shows a new partition $G_{P3}$ constructed from $G_{C3}$. By
repeating this process, we finally obtain the decomposed block representation of
Fig. 6i consisting of five blocks $B_{1:5}$. The output functions of these blocks
are described by Verilog code in equation form. If UD-Partition($G_C$) constructs
$k$ partitions, UDSYN produces a total of $k + 1$ blocks.

Step 6 of Table 3 uses a type of binary decision diagram (BDD)
called a zero-suppressed BDD (ZSBDD) to represent the SOP and POS forms of
functions. Although other forms of BDD can be used, we limit our attention in this
description to ZSBDDs for the sake of presentation. A ZSBDD of a function $f$ is
a graph whose paths denote the product terms (cubes) in an SOP form of $f$.
UDSYN uses two ZSBDDs to represent a pair of SOP and POS forms for the
internal function of each node in an AND-OR tree like that in Fig. 4. Thus an
AND-OR tree with $n$ nodes is represented by $2n$ individual ZSBDDs, each of which
is linked to the corresponding node in the tree.

For example, Figs. 7a and b show a ZSBDD representing
$f_4^{SOP} = ab + \bar{c}\bar{b}$. The internal nodes (circles) in Figs. 7a and b
denote the literals appearing in $f_4^{SOP}$. The terminal or leaf nodes (rectangles)
denote the output values that $f_4^{SOP}$ generates when its literals are set to the
values specified on the edges. The name "zero-suppressed" stems from the property
that all nodes in a ZSBDD whose 1-edge points to the 0-terminal node are eliminated.
Every path from the root to the 1-terminal node represents a cube in $f_4^{SOP}$.
For example, the path highlighted by the dashed line in Fig. 7a represents cube
$ab$, while the one highlighted in Fig. 7b represents cube $\bar{c}\bar{b}$.
Although ZSBDDs can represent only SOP forms directly, POS forms can also be
handled by their complemented form.

For example, consider the POS expression

$$f_4^{POS} = (a + \bar{c})(a + \bar{b})(b + \bar{c})$$

having the following complement

$$\overline{f_4^{POS}} = \bar{a}c + \bar{a}b + \bar{b}c$$

Figure 7c shows a ZSBDD that represents $\overline{f_4^{POS}}$, where every path
from the root to the 1-terminal node represents a sum term in $f_4^{POS}$ with its
literals complemented. In this way, we can represent both SOP and POS forms using
ZSBDDs.
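
The complement trick rests on De Morgan's law, which the following sketch checks
exhaustively for a small case. The encoding (clauses and cubes as frozensets of
literal strings, "~" marking a complement), the helper names, and the polarities
in the sample expression are assumptions chosen for illustration.

```python
from itertools import product


def negate(literal):
    """Complement a literal: "a" <-> "~a"."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def complement_pos(pos):
    """De Morgan: the complement of a product of sums is the sum of the
    products obtained by complementing every literal of each sum term."""
    return [frozenset(negate(lit) for lit in clause) for clause in pos]


def literal_value(literal, env):
    return not env[literal[1:]] if literal.startswith("~") else env[literal]


def eval_pos(pos, env):
    return all(any(literal_value(l, env) for l in clause) for clause in pos)


def eval_sop(sop, env):
    return any(all(literal_value(l, env) for l in cube) for cube in sop)


# (a + ~c)(a + ~b)(b + ~c) and its complement ~a c + ~a b + ~b c
pos = [frozenset({"a", "~c"}), frozenset({"a", "~b"}), frozenset({"b", "~c"})]
sop = complement_pos(pos)

# The two forms must disagree on every input assignment.
complementary = all(
    eval_pos(pos, dict(zip("abc", bits))) != eval_sop(sop, dict(zip("abc", bits)))
    for bits in product((False, True), repeat=3)
)
```

Because the complemented form is an ordinary SOP, it can be stored in the same
cube-set data structure as any other SOP form.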






ZSBDDs have been shown to represent large functions efficiently.
This is because small ZSBDDs can contain a large number of paths, so
a large number of cubes can often be represented by a compact ZSBDD. ZSBDDs
also support fast manipulation of sets of cubes such as finding subsets, computing
the intersection of cube sets, and performing a type of division operation on cube
sets. Utilizing these features, we implement unate AND-OR decomposition and
division processes that act directly on ZSBDDs.

UD-Partition($G_C$) creates a partition of the input circuit in a way that
makes the functions defining the partition highly unate, while meeting a specified
partition size limit. Partitions created in this way simplify the unate decomposition
process. One pseudo-code embodiment of UD-Partition appears in Table 4.
Notably, it is understood by those of ordinary skill in the art that a variety of
different computer programs and program arrangements can be implemented to
support and execute the inventive function of UD-Partition.

TABLE 4

Embodiment of process UD-Partition(G_C)

 1: for each n_c in G_C in level order begin
 2:   for each fanin node n_i of n_c begin
 3:     if (n_i is a primary input of G_C) then
 4:       S_supp(n_c) := S_supp(n_c) ∪ n_i;          /* S_supp(n_c) is n_c's support set */
 5:     else
 6:       S_supp(n_c) := S_supp(n_c) ∪ S_supp(n_i);
 7:     Calculate-Path-Count(S_supp(n_c));           /* Add path count of fan-in nodes to n_c */
 8:   end;
 9:   N_BV(n_c) = Σ_{i=1..k} Binate(s_i, n_c);       /* Binate(s_i, n_c) = 1 if s_i is binate for n_c */
10: end;
11: while (G_C ≠ ∅) begin
12:   n_m := Select-Node-of-Min-N_BV(G_C);           /* n_m is to be included in the partition */
13:   S_N := nodes in n_m's fan-in cone in G_C;
14:   if (I/O-count(G_P ∪ S_N) < threshold) then     /* G_P is the graph for the partition */
15:     G_P := G_P ∪ S_N; G_C := G_C - G_P;          /* Add n_m and its transitive fan-in nodes to G_P */
16:   else break;                                    /* Discard the candidate node n_m */
17: end;
18: return (G_C, G_P);

Steps 1 to 10 compute the number of binate support variables of each
node in the circuit graph $G_C$. Steps 11 to 17 create the current partition $G_P$ by
selecting nodes in $G_C$ that have a minimal number of binate variables.

A node $n_c$ in $G_C$ is unate with respect to a primary input $s_i$ if all paths
between $n_c$ and $s_i$ have the same inversion parity; otherwise, $n_c$ is binate
with respect to $s_i$. To determine the inversion parity of the paths, we calculate
the number of paths from the primary inputs to each node $n_c$ in $G_C$. Let
$p_{even}(s_i, n_c)$ and $p_{odd}(s_i, n_c)$ be the number of paths from a primary
input $s_i$ to a node $n_c$ whose inversion parity is even and odd, respectively.
Steps 3 to 7 find the set $S_{supp}(n_c)$ of support variables for each node $n_c$.
For $n_c$ and its fanin nodes $n_i$, Calculate-Path-Count obtains
$p_{even}(s_i, n_c)$ and $p_{odd}(s_i, n_c)$ by recursively computing

$$p_{even}(s_i, n_c) = p_{even}(s_i, n_c) + p_{even}(s_i, n_i)$$
$$p_{odd}(s_i, n_c) = p_{odd}(s_i, n_c) + p_{odd}(s_i, n_i)$$

if the inversion parity from $n_i$ to $n_c$ is even; otherwise, it computes

$$p_{even}(s_i, n_c) = p_{even}(s_i, n_c) + p_{odd}(s_i, n_i)$$
$$p_{odd}(s_i, n_c) = p_{odd}(s_i, n_c) + p_{even}(s_i, n_i)$$

The binary function Binate($s_i$, $n_c$) produces 1 (0) if a node $n_c$ is
binate (unate) with respect to its support variable $s_i$, and is computed by

$$\mathrm{Binate}(s_i, n_c) = \begin{cases} 0, & \text{if } p_{even}(s_i, n_c) = 0 \text{ or } p_{odd}(s_i, n_c) = 0 \\ 1, & \text{otherwise} \end{cases}$$

The number $N_{BV}(n_c)$ of binate variables of node $n_c$ with $k$ variables is
defined as

$$N_{BV}(n_c) = \sum_{i=1}^{k} \mathrm{Binate}(s_i, n_c)$$

The intuition behind using $N_{BV}(n_c)$ to guide the partitioning stems
from the fact that the more binate the node $n_c$, the more difficult the
decomposition process for $n_c$ tends to be. Steps 1 to 10 traverse every node only
once, which has complexity $O(N)$. They propagate $p_{even}(s_i, n_c)$ and
$p_{odd}(s_i, n_c)$ for every $s_i$ of a node $n_c$ to the nodes in the transitive
fanout of $n_c$, which also accounts for complexity $O(N)$. Hence the overall
complexity of computing $N_{BV}(n_c)$ for all nodes in $G_C$ is $O(N^2)$.
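
The path-count propagation can be sketched on a small gate graph. The data
layout below (a dictionary mapping each node to a list of fan-in pairs, with a
flag marking an inverting connection) and the three-node example are
assumptions for illustration; the specification does not prescribe a data
structure.

```python
def path_parity_counts(fanins, order):
    """Propagate (p_even, p_odd) path counts through a circuit graph.

    fanins[n] is a list of (source, inverting) pairs; any source that is not
    a key of `fanins` is treated as a primary input. `order` lists the nodes
    in level order, so fan-in nodes are processed before their fan-outs.
    """
    counts = {}
    for node in order:
        acc = {}
        for src, inverting in fanins[node]:
            # A primary input reaches itself by one path of even parity.
            contrib = counts[src] if src in fanins else {src: (1, 0)}
            for var, (even, odd) in contrib.items():
                if inverting:
                    even, odd = odd, even
                e0, o0 = acc.get(var, (0, 0))
                acc[var] = (e0 + even, o0 + odd)
        counts[node] = acc
    return counts


def n_bv(node_counts):
    """Count support variables reached by both even- and odd-parity paths."""
    return sum(1 for even, odd in node_counts.values() if even > 0 and odd > 0)


# Hypothetical XOR structure: n1 = a AND (NOT b), n2 = (NOT a) AND b, n3 = n1 OR n2
fanins = {
    "n1": [("a", False), ("b", True)],
    "n2": [("a", True), ("b", False)],
    "n3": [("n1", False), ("n2", False)],
}
counts = path_parity_counts(fanins, ["n1", "n2", "n3"])
```

In this example `n1` and `n2` are unate in both inputs, while `n3` sees each
input through one even and one odd path and is therefore binate in both.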

For example, Table 5 shows the calculation of $N_{BV}(n_c)$ for every node
in Fig. 6a.

TABLE 5

Node | No. of paths from n_c's support variable s_i | No. of binate variables N_BV(n_c)
n5 | (a, 0, 1), (b, 0, 1) | 0
n10 | (b, 0, 1), (d, 0, 1) | 0
n3 | (f, 1, 0), (d, 1, 0) | 0
n12 | (g, 0, 1) | 0
n13 | (h, 1, 0), (i, 1, 0) | 0
n8 | (c, 0, 1), (b, 1, 0), (d, 1, 0) | 0
n7 | (g, 1, 0), (f, 0, 1), (d, 0, 1) | 0
n14 | (h, 1, 0), (i, 1, 0), (f, 1, 0), (d, 1, 0) | 0
n9 | (c, 1, 0), (b, 0, 1), (d, 0, 1) | 0
n11 | (b, 1, 1), (d, 1, 1), (c, 1, 0), (e, 0, 1) | 2
n4 | (c, 1, 1), (b, 1, 1), (d, 2, 2), (g, 1, 1), (f, 1, 1) | 5
n2 | (c, 0, 1), (b, 2, 0), (d, 1, 0), (a, 1, 0) | 0
n6 | (a, 1, 1), (b, 2, 2), (c, 1, 1), (d, 1, 1) | 4
n18 | (c, 1, 1), (b, 1, 1), (d, 3, 2), (g, 1, 1), (f, 2, 1), (h, 1, 0), (i, 1, 0) | 5
n15 | (c, 2, 2), (b, 4, 4), (d, 3, 3), (a, 1, 1), (e, 1, 1) | 5
n1 | (f, 0, 1), (d, 1, 2), (a, 1, 1), (b, 2, 2), (c, 1, 1) | 4
n16 | (f, 3, 3), (d, 7, 7), (a, 2, 2), (b, 6, 6), (c, 4, 4), (g, 2, 2) | 6
n17 | (g, 5, 5), (f, 7, 7), (d, 15, 15), (a, 4, 4), (b, 12, 12), (c, 8, 8) | 6



The second column lists $p_{odd}(s_i, n_c)$ and $p_{even}(s_i, n_c)$ computed for
each node $n_c$ and all its support variables. The last column gives $N_{BV}(n_c)$.
For example, for $n_c$ = n11, Binate(b, n11) = 1, Binate(d, n11) = 1,
Binate(c, n11) = 0, and Binate(e, n11) = 0. Thus $N_{BV}(n_c) = 1 + 1 + 0 + 0 = 2$.

After $N_{BV}(n_c)$ is computed for every $n_c$ in $G_C$, UD-Partition selects
from $G_C$ a node $n_m$ of minimal $N_{BV}(n_c)$ starting from a primary input of
$G_C$. It then inserts into $G_P$ all non-partitioned nodes in the transitive fanin
region of $n_m$. This process is repeated until the size of $G_P$ exceeds a threshold
equal to the maximum number of $G_P$'s I/O lines. By limiting the partition size in
this way, we can prevent ZSBDDs from exploding for large circuits, while producing a
partition with highly unate output functions.

Figure 8 illustrates how we partition $G_C$ of Fig. 6a. Suppose we limit
$G_P$'s I/O lines to seven inputs and six outputs, that is, we set the threshold to
7/6. The $N_{BV}(n_c)$ values calculated in Table 5 are shown next to each node $n_c$
in Fig. 8. Figures 8a to d indicate the current $G_P$ created in each iteration by
shading, and newly selected nodes by thick circles. The first $n_m$ is selected from
the candidate nodes n3, n5, n8, n10, n11, n12, and n13, which are adjacent to the
primary inputs. We select n3, whose $N_{BV}(n_c)$ has the minimum value 0, and add it
to $G_P$; see Fig. 8a. The next search begins from n3 and selects n3's fanout node
n7, whose $N_{BV}(n_c) = 0$. We then select all nodes in the transitive fanin region
of n7; Fig. 8a indicates these selected nodes by a dashed line. Figure 8b shows the
current $G_P$ consisting of n3, n12, and n7. We then select n14 over n1, n4, and n11,
and then select n14's fanin node n13; the newly selected nodes are again indicated by
a dashed line in Fig. 8b. We next select n11 over n18, n1, and n4, but n11 leads to a
partition with seven outputs, one greater than the limit of six. Hence we select n18
instead, which has the next smallest $N_{BV}(n_c)$. We then select nodes in n18's
transitive fanin region; see Fig. 8c. At this point, the number of I/O lines of
$G_P$ equals the threshold 7/6, so the partitioning is done. Figure 8d indicates the
final $G_P$ by shading.

Since UD-Partition selects nodes with fewer binate variables first, it
often leads to a partition where many output functions are already unate and so
require no further decomposition. For example, in $G_P$ of Fig. 8d, four output
functions at n3, n7, n8, and n10 are unate. Figure 6c shows a unate decomposition
of this $G_P$, where nodes g3, g7, g8, and g10 in $B_1$ correspond to these four
unate functions, and so are not decomposed.

Next we describe Unate-Decomp(G) which systematically applies 
unate AND-OR decomposition operations to a circuit partition. See Table 6 for one 
pseudo-code embodiment of Unate-Decomp(G) . 



TABLE 6

Embodiment of process Unate-Decomp(G)

 1: S_B := G's binate function nodes;            /* S_B stores nodes of binate functions to be decomposed */
 2: while (S_B ≠ ∅) begin
 3:   for each node n_B in S_B begin
 4:     (n_u, n_b) := ANDOR-OneLevel(G, n_B);    /* n_u (n_b) points to subfunction g_u (g_b) in (3.4) */
 5:     S_D := S_D ∪ {n_u, n_b};                 /* S_D stores candidate divisor nodes */
 6:   end;
 7:   for each node n_d in S_D begin
 8:     for each node n_f in S_B - S_D begin     /* n_f is a candidate dividend node */
 9:       (n_q, n_r) := Division(n_f, n_d);      /* n_q is the quotient and n_r is the remainder */
10:       if (N_BV(n_f) < N_BV(n_q) + N_BV(n_r)) then
11:         Reverse the division;
12:     end;
13:   end;
14:   S_B := ∅; S_D := ∅;                        /* Find new nodes to be decomposed */
15:   for each node n_i in G begin
16:     if (N_BV(n_i) > threshold)               /* n_i exceeds the threshold */
17:       S_B := S_B ∪ n_i;
18:   end;
19: end;
20: return G;

Graph $G$ initially contains the nodes in the current partition. Steps
3 to 6 perform a level of AND-OR decomposition on every binate node $n_B$ in $G$.
Then, steps 7 to 13 perform division operations on every binate node in $G$ by
treating as divisors child nodes created by the AND-OR decompositions.
$N_{BV}(n_i)$ denotes the number of binate variables in the subfunction at node
$n_i$ in $G$. If a division reduces $N_{BV}(n_i)$, it is accepted; otherwise, it is
discarded. The above process is repeated until all nodes in $G$ become unate. For
some large binate functions, forcing all nodes to be unate leads to an excessive
number of subfunctions. We therefore stop decomposing a node $n_i$ if $N_{BV}(n_i)$
becomes less than a certain threshold. This threshold is chosen to yield a small set
of subfunctions at the cost of lower unateness. Thus the threshold allows us to
trade the level of unateness for a higher implementation flexibility of the block
representation.

Table 7 contains a pseudo-code embodiment of the computer-implemented
process ANDOR-OneLevel(G, n_B), which implements one level of the
AND-OR decomposition technique described earlier.

TABLE 7

Embodiment of process ANDOR-OneLevel(G, n_B)

1: (g_u^SOP, g_b^SOP) := Find-Unate-Cube-Set(SOP(n_B));          /* OR decomposition */
2: (g_u^CPOS, g_b^CPOS) := Find-Unate-Cube-Set(Inv(SOP(n_B)));   /* AND decomposition */
3: if (N_BV(g_b^SOP) < N_BV(g_b^CPOS)) then
4:   Replace-Node(n_B, NewNodes(h^SOP, g_u^SOP, g_b^SOP));       /* Replace n_B in G by the new nodes */
5:   return (Node(g_u^SOP), Node(g_b^SOP));
6: else
7:   Replace-Node(n_B, NewNodes(h^POS, Inv(g_u^CPOS), Inv(g_b^CPOS)));  /* Inv(g) complements g */
8:   return (Node(Inv(g_u^CPOS)), Node(Inv(g_b^CPOS)));



The process Find-Unate-Cube-Set(SOP($n_B$)) finds a set of unate cubes
(product terms) from an SOP representation SOP($n_B$) for node $n_B$. This operation
forms an OR decomposition. An AND decomposition is obtained by
complementing the input SOP($n_B$) and the outputs of Find-Unate-Cube-Set,
respectively. This enables ZSBDDs to handle both AND and OR decompositions,
although ZSBDDs can only represent SOP forms directly.

Table 8 contains a pseudo-code embodiment of the computer-implemented
process Find-Unate-Cube-Set(ISOP).

TABLE 8

Embodiment of process Find-Unate-Cube-Set(ISOP)     /* ISOP is the initial SOP form */

 1: S_best := SOP := ISOP;
 2: while (N_BV(SOP) > threshold) begin
 3:   for each literal l_i for binate variables in SOP repeat
 4:     S_i := Subset(SOP, l_i);                     /* Remove all cubes containing l_i */
 5:     if (N_BV(S_i) + N_BV(ISOP - S_i) < N_BV(S_best) + N_BV(ISOP - S_best)) then
 6:       S_best := S_i;
 7:   end;
 8:   SOP := S_best;
 9: end;
10: g_u := S_best;
11: g_b := ISOP - S_best;
12: return (g_u, g_b);





Find-Unate-Cube-Set(ISOP) derives a cube set $S$ from $f$'s initial SOP
form ISOP so that $S$ meets the threshold on $N_{BV}(S)$. As a result, $S$ defines
unate subfunction $g_u^k$ in (1), while ISOP - $S$ defines $g_b^k$. As discussed
earlier, an exact method to find an optimum AND-OR decomposition of an $m$-term ISOP
must examine up to $2^m$ possible AND decompositions. To avoid this and derive $S$
efficiently, a type of cofactor operation is implemented which can simultaneously
extract from ISOP multiple cubes (product terms) with a common property. This
operation, denoted by Subset(SOP, $l_i$), removes from SOP all cubes that contain
literal $l_i$. Thus $l_i$ does not appear in the resulting SOP form $S_i$, while
$\bar{l}_i$ may appear in $S_i$; hence $S_i$ is unate with respect to $l_i$.

For example, consider a function $f$ whose SOP form is

$$SOP = ab\bar{c} + a\bar{c}d + a\bar{d} + \bar{b}\bar{d}a + ed$$

Subset(SOP, $\bar{d}$) removes cubes $a\bar{d}$ and $\bar{b}\bar{d}a$, which contain
literal $\bar{d}$, and yields

$$S_i = ab\bar{c} + a\bar{c}d + ed$$

The basic concept is to apply Subset(SOP, $l_i$) to a set of binate literals
in a way that makes both $S_i$ and SOP - $S_i$ highly unate. We found that for a
binate $l_i$, Subset(SOP, $l_i$) often eliminates from SOP not only $l_i$ but also
other binate literals. Hence we can often obtain a highly unate cube set $S$ by
repeating only a few steps of Subset(SOP, $l_i$). The inner loop (steps 3 to 8) of
Table 8 performs Subset(SOP, $l_i$) for all binate literals $l_i$, and selects
$S_{best}$ (i.e., the $S_i$ having the minimum
$N_{BV}(S_i) + N_{BV}(\mathrm{ISOP} - S_i)$). The outer loop (steps 2 to 9) repeats
this process recursively with $S_{best}$ in place of SOP until $N_{BV}(S_{best})$
becomes less than threshold. The final $S_{best}$ defines $g_u$, while
ISOP - $S_{best}$ defines $g_b$.
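
The two nested loops can be sketched on explicit cube lists as follows. This is
an illustrative approximation of Table 8, not the ZSBDD-based implementation:
the cube encoding ("~" marks a complemented literal), the helper names, the
sample input, and the early exit when no candidate improves the cost are all
assumptions added for the sketch.

```python
def binate_count(sop):
    """N_BV of a cube list: variables present in both polarities."""
    pos = {lit for cube in sop for lit in cube if not lit.startswith("~")}
    neg = {lit[1:] for cube in sop for lit in cube if lit.startswith("~")}
    return len(pos & neg)


def binate_literals(sop):
    """Both literals of every binate variable, in a fixed order."""
    pos = {lit for cube in sop for lit in cube if not lit.startswith("~")}
    neg = {lit[1:] for cube in sop for lit in cube if lit.startswith("~")}
    literals = []
    for var in sorted(pos & neg):
        literals += [var, "~" + var]
    return literals


def find_unate_cube_set(isop, threshold=0):
    isop = [frozenset(cube) for cube in isop]

    def cost(cubes):
        rest = [c for c in isop if c not in cubes]
        return binate_count(cubes) + binate_count(rest)

    sop = best = isop
    while binate_count(sop) > threshold:
        improved = False
        for lit in binate_literals(sop):
            s_i = [c for c in sop if lit not in c]   # Subset(SOP, l_i)
            if cost(s_i) < cost(best):
                best, improved = s_i, True
        if not improved:   # safeguard added for this sketch, not in Table 8
            break
        sop = best
    g_u = best
    g_b = [c for c in isop if c not in best]
    return g_u, g_b


# Hypothetical input: ab + ~a c + cd
g_u, g_b = find_unate_cube_set([{"a", "b"}, {"~a", "c"}, {"c", "d"}])
```

For this input a single pass suffices: removing the cubes that contain `a`
leaves a unate $g_u$ and a single-cube, and therefore unate, $g_b$.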

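To illustrate, consider an initial SOP form

$$ISOP = ab\bar{c} + a\bar{c}d + a\bar{d} + \bar{b}\bar{d}a + \bar{b}c + \bar{a}bde + ed$$

Suppose that the threshold of $N_{BV}$ is 0. Table 9 shows each step of the outer loop
in its first iteration with ISOP assigned to SOP.

TABLE 9

SOP = ISOP = $ab\bar{c} + a\bar{c}d + a\bar{d} + \bar{b}\bar{d}a + \bar{b}c + \bar{a}bde + ed$

Binate literal l_i | S_i | ISOP - S_i | N_BV(S_i) + N_BV(ISOP - S_i)
$a$ | $\bar{b}c + \bar{a}bde + ed$ | $ab\bar{c} + a\bar{c}d + a\bar{d} + \bar{b}\bar{d}a$ | 3
$\bar{a}$ | $ab\bar{c} + a\bar{c}d + a\bar{d} + \bar{b}\bar{d}a + \bar{b}c + ed$ | $\bar{a}bde$ | 3
$b$ | $a\bar{d} + \bar{b}\bar{d}a + ed + \bar{b}c$ | $ab\bar{c} + a\bar{c}d + \bar{a}bde$ | 2
$\bar{b}$ | $ab\bar{c} + a\bar{c}d + a\bar{d} + ed + \bar{a}bde$ | $\bar{b}\bar{d}a + \bar{b}c$ | 2
$c$ | $ab\bar{c} + a\bar{c}d + a\bar{d} + \bar{b}\bar{d}a + \bar{a}bde$ | $\bar{b}c$ | 2
$\bar{c}$ | $a\bar{d} + \bar{b}\bar{d}a + \bar{a}bde + \bar{b}c$ | $ab\bar{c} + a\bar{c}d$ | 3
$d$ | $ab\bar{c} + a\bar{d} + \bar{b}\bar{d}a + \bar{b}c$ | $a\bar{c}d + \bar{a}bde$ | 3
$\bar{d}$ | $ab\bar{c} + a\bar{c}d + ed + \bar{a}bde + \bar{b}c$ | $a\bar{d} + \bar{b}\bar{d}a$ | 3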


Each row in Table 9 shows $S_i$ and ISOP - $S_i$ obtained by Subset(SOP, $l_i$)
for binate literals $l_i$ of ISOP. Row 3 (i.e. binate literal $b$) gives
the minimum $N_{BV}(S_i) + N_{BV}(\mathrm{SOP} - S_i)$ and so is selected. The
selected $S_i$ is still binate, so the second iteration of the outer loop is
performed with the $S_i$ assigned to SOP; see Table 10. Each row gives
Subset(SOP, $l_i$) for binate literals $l_i = d$ and $\bar{d}$ in SOP.






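TABLE 10

SOP = S_best = $a\bar{d} + \bar{b}\bar{d}a + ed + \bar{b}c$

Binate literal l_i | S_i | ISOP - S_i | N_BV(S_i) + N_BV(ISOP - S_i)
$d$ | $a\bar{d} + \bar{b}\bar{d}a + \bar{b}c$ | $ed + ab\bar{c} + a\bar{c}d + \bar{a}bde$ | 1
$\bar{d}$ | $ed + \bar{b}c$ | $a\bar{d} + \bar{b}\bar{d}a + ab\bar{c} + a\bar{c}d + \bar{a}bde$ | 3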


The first row (i.e. literal $d$) of Table 10 gives the lower
$N_{BV}(S_i) + N_{BV}(\mathrm{SOP} - S_i)$, and is selected. The selected $S_i$ now is
unate and so the process is done. We finally obtain
$g_u = a\bar{d} + \bar{b}\bar{d}a + \bar{b}c$ and
$g_b = ed + ab\bar{c} + a\bar{c}d + \bar{a}bde$. Since Find-Unate-Cube-Set(ISOP) aims
to reduce both $N_{BV}(S_i)$ and $N_{BV}(\mathrm{SOP} - S_i)$, it tends to make $g_b$
highly unate as well. Observe that the $g_b$ produced in this example is unate for
all but one variable ($a$).

The Subset(SOP, $l_i$) operation conducted using an $M$-node ZSBDD
has a complexity of $O(M)$. Find-Unate-Cube-Set(ISOP) for an ISOP with $N$ binate
variables repeats the inner loop $N^2$ times. Hence the worst case complexity of
Find-Unate-Cube-Set(ISOP) is $O(N^2 M)$. Compare this with the complexity $O(2^m)$
of an exact method discussed above; $m$ is usually significantly greater than $N$
and $M$. Thus the presented AND-OR decomposition process can generate highly unate
$g_u$ and $g_b$ quite efficiently.

While embodiments of the invention have been illustrated and 
described, it is not intended that these embodiments illustrate and describe all 
possible forms of the invention. Rather, the words used in the specification are 
words of description rather than limitation, and it is understood that various changes 
may be made without departing from the spirit and scope of the invention.